
Proxy Prompt: Endowing SAM and SAM 2 with Auto-Interactive-Prompt for Medical Segmentation

Xinyi, Wang, Hongyu, Kang, Peishan, Wei, Li, Shuai, Sun, Yu, Lam, Sai Kit, Zheng, Yongping

arXiv.org Artificial Intelligence

In this paper, we aim to address the unmet demand for automated prompting and enhanced human-model interaction in SAM and SAM 2, with the goal of promoting their widespread clinical adoption. Specifically, we propose Proxy Prompt (PP), auto-generated by leveraging non-target data with a pre-annotated mask. We devise a novel 3-step context-selection strategy for adaptively selecting the most representative contextual information from non-target data via vision mamba and selective maps, empowering non-target image-mask pairs to guide segmentation on target image/video data. To reinforce human-model interaction in PP, we further propose a contextual colorization module built on dual-reverse cross-attention, which enhances interactions between target features and contextual embeddings while amplifying the distinctive features of user-defined object(s). Via extensive evaluations, our method achieves state-of-the-art performance on four public datasets and yields results comparable to fully-trained models, even when trained with only 16 image masks.


Uncovering the Text Embedding in Text-to-Image Diffusion Models

Yu, Hu, Luo, Hao, Wang, Fan, Zhao, Feng

arXiv.org Artificial Intelligence

The correspondence between input text and the generated image is opaque: minor textual modifications can induce substantial deviations in the generated image. Meanwhile, text embedding, the pivotal intermediary between text and images, remains relatively underexplored. In this paper, we address this research gap by delving into the text embedding space, unleashing its capacity for controllable image editing and explicable semantic direction attributes within a learning-free framework. Specifically, we identify two critical insights regarding the importance of per-word embeddings and their contextual correlations within the text embedding, providing instructive principles for learning-free image editing. Additionally, we find that text embedding inherently possesses diverse semantic potential, and we further reveal this property through the lens of singular value decomposition (SVD). These uncovered properties offer practical utility for image editing and semantic discovery. More importantly, we expect that these in-depth analyses and findings of the text embedding will enhance the understanding of text-to-image diffusion models.


A Modular End-to-End Multimodal Learning Method for Structured and Unstructured Data

Alessandro, Marco D, Calabrés, Enrique, Elkano, Mikel

arXiv.org Artificial Intelligence

Multimodal learning is a rapidly growing research field that has revolutionized multitasking and generative modeling in AI. While much of the research has focused on unstructured data (e.g., language, images, audio, or video), structured data (e.g., tabular data, time series, or signals) has received less attention. However, many industry-relevant use cases involve, or can benefit from, both types of data. In this work, we propose a modular, end-to-end multimodal learning method called MAGNUM, which can natively handle both structured and unstructured data. MAGNUM is flexible enough to employ any specialized unimodal module to extract, compress, and fuse information from all available modalities.
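The modular design described above can be sketched as follows. This is not MAGNUM's actual code: the encoder definitions, the shared 16-dimensional space, and the mean-pool fusion are all illustrative assumptions standing in for the paper's specialized unimodal modules and fusion mechanism:

```python
import numpy as np

D = 16  # assumed shared embedding dimension

def encode_tabular(row):
    # Structured modality: a toy linear projection into the shared space.
    w = np.ones((len(row), D)) / len(row)
    return row @ w

def encode_text(token_ids):
    # Unstructured modality: mean of toy token vectors from a lookup table.
    table = np.full((1000, D), 0.01)
    return table[token_ids].mean(axis=0)

def fuse(embeddings):
    # Modality-agnostic fusion: mean-pool the unimodal embeddings.
    return np.mean(embeddings, axis=0)

row = np.array([0.2, 1.5, -0.3])      # a tabular record
tokens = np.array([5, 42, 7])         # token ids from a text field
z = fuse([encode_tabular(row), encode_text(tokens)])
print(z.shape)  # (16,)
```

The point of the modular structure is that each `encode_*` function can be swapped for any specialized unimodal model, as long as it emits a vector in the shared space.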


Design and Implementation of English To Yorùbá Verb Phrase Machine Translation System

Ajibade, Benjamin, Eludiora, Safiriyu

arXiv.org Artificial Intelligence

The advancement in Natural Language Processing (NLP) can be attributed to recent improvements in the strategy and techniques of large data collection, archiving, analysis, and visualization. NLP began in the '50s as machine translation (MT), intended to aid in code-breaking during World War II. Although the translations were not successful, these early stages of MT were necessary stepping stones on the way to more sophisticated technologies (Zhang, 2018; Quinn, …). Despite the population of speakers, Yorùbá is still considered a low-resource language (one for which few language resources exist), making it very difficult to develop more advanced models such as the Neural Machine Translation model, which requires large volumes of data. Given the number of speakers, translating the language to other widely spoken languages was not initially emphasized. However, recent linguistic researchers are taking up the challenge by giving it more attention (as compared to the high-resource languages of the Western world).


DiffCollage: Parallel Generation of Large Content with Diffusion Models

Zhang, Qinsheng, Song, Jiaming, Huang, Xun, Chen, Yongxin, Liu, Ming-Yu

arXiv.org Artificial Intelligence

We present DiffCollage, a compositional diffusion model that can generate large content by leveraging diffusion models trained on generating pieces of the large content. Our approach is based on a factor graph representation where each factor node represents a portion of the content and a variable node represents their overlap. This representation allows us to aggregate intermediate outputs from diffusion models defined on individual nodes to generate content of arbitrary size and shape in parallel without resorting to an autoregressive generation procedure. We apply DiffCollage to various tasks, including infinite image generation, panorama image generation, and long-duration text-guided motion generation. Extensive experimental results with a comparison to strong autoregressive baselines verify the effectiveness of our approach.
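The factor-graph aggregation described above can be sketched in one dimension. This is a hedged illustration, not the authors' implementation: the function name, the 1-D "content", and the per-piece score arrays are assumptions; the combination rule shown (sum of factor scores minus the shared overlap scores, so each position contributes exactly once) is the basic idea of merging overlapping pieces without autoregression:

```python
import numpy as np

def collage_score(piece_scores, piece_slices, overlap_scores, overlap_slices, size):
    """Combine per-piece 1-D scores into a single score for the full signal.

    Factor nodes (pieces) are added; variable nodes (overlaps) are
    subtracted once so shared regions are not double-counted.
    """
    total = np.zeros(size)
    for s, sl in zip(piece_scores, piece_slices):
        total[sl] += s   # add each factor node's contribution
    for s, sl in zip(overlap_scores, overlap_slices):
        total[sl] -= s   # remove the doubly-counted overlap once
    return total

# Two 6-sample pieces of a 10-sample signal, overlapping on samples 4-5.
size = 10
piece_slices = [slice(0, 6), slice(4, 10)]
overlap_slices = [slice(4, 6)]
piece_scores = [np.ones(6), np.ones(6)]
overlap_scores = [np.ones(2)]

combined = collage_score(piece_scores, piece_slices,
                         overlap_scores, overlap_slices, size)
print(combined)  # all ones: every position is counted exactly once
```

Because each piece's score can come from an independent diffusion model on its own crop, all pieces can be denoised in parallel, which is what lets the method scale to arbitrarily large content.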